Reverberant Speech Recognition Combining Deep Neural Networks and Deep Autoencoders
نویسندگان
چکیده
We propose an approach to reverberant speech recognition adopting deep learning in front end as well as back end of the system. At the front end, we adopt a deep autoencoder for enhancing the speech feature parameters, and the recognition is performed using a DNN-HMM acoustic models trained on multi-condition data. The system was evaluated through the ASR task in Chime Challenge 2014. The DNN-HMM system trained on the multi-condition training set achieved a conspicuously higher word accuracy compared to the MLLR-adapted GMM-HMM system trained on the same data. Furthermore, feature enhancement with the deep autoencoder contributed to the improvement of recognition accuracy especially in the more adverse conditions. When the DNN-HMM was used without the deep autoencoder front end, it resulted in a better performance than the non-adapted GMM-HMM system, but was not as good as the adapted GMM-HMM system. However, it outperformed the adapted GMM-HMM system when combined with the deep autoencoder.
منابع مشابه
Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature
We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, and speech recognitio...
متن کاملCombining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)
Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کاملAdvances in Deep Learning
Deep neural networks have become increasingly more popular under the name of deep learning recently due to their success in challenging machine learning tasks. Although the popularity is mainly due to the recent successes, the history of neural networks goes as far back as 1958 when Rosenblatt presented a perceptron learning algorithm. Since then, various kinds of artificial neural networks hav...
متن کاملSpeech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014